    Third International Workshop on Gamification for Information Retrieval (GamifIR'16)

    Stronger engagement and greater participation are often crucial to reach a goal or to solve an issue, such as the emerging employee engagement crisis, insufficient knowledge sharing, and chronic procrastination. In many cases we need, and search for, tools to beat procrastination or to change people’s habits. Gamification is the approach of learning from games, which are often fun, creative, and engaging; in principle, it is about understanding games and applying game design elements in non-gaming environments. This offers possibilities for improvements in a wide range of areas, for example more accurate work, better retention rates, and more cost-effective solutions, since the motivation for participating is more intrinsic than with conventional methods. In the context of Information Retrieval (IR) it is not hard to imagine that many tasks could benefit from gamification techniques. Besides the manual annotation of data sets for IR research, user participation is important in order to gather implicit or even explicit feedback to feed the algorithms. Gamification, however, comes with its own challenges, and its adoption in IR is still in its infancy. Given the enormous response to the first and second GamifIR workshops, both co-located with ECIR, and the broad range of topics discussed, we organized the third workshop at SIGIR 2016 to address a range of emerging challenges and opportunities.

    The accessibility dimension for structured document retrieval

    Structured document retrieval aims at retrieving the document components that best satisfy a query, instead of merely retrieving pre-defined document units. This paper reports on an investigation of a tf-idf-acc approach, where tf and idf are the classical term frequency and inverse document frequency, and acc is a new parameter, called accessibility, that captures the structure of documents. The tf-idf-acc approach is defined using a probabilistic relational algebra. To investigate the retrieval quality and estimate the acc values, we developed a method that automatically constructs diverse test collections of structured documents from a standard test collection, with which experiments were carried out. The analysis of the experiments provides estimates of the acc values.
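
    The paper defines tf-idf-acc within a probabilistic relational algebra; as a rough illustration only, the sketch below combines the classical tf-idf weight with a per-component accessibility value acc. The multiplicative combination and the example acc values are assumptions made here for illustration, not the paper's estimated parameters.

    ```python
    import math

    def tf_idf_acc_score(tf, df, n_docs, acc):
        """Illustrative score for a document component: classical tf-idf
        scaled by an accessibility value acc in (0, 1] that reflects the
        component's place in the document structure (assumed multiplicative)."""
        idf = math.log(n_docs / df)
        return tf * idf * acc

    # Hypothetical example: identical term statistics, but a top-level section
    # (more accessible) versus a deeply nested paragraph (less accessible).
    print(tf_idf_acc_score(tf=3, df=10, n_docs=1000, acc=0.9))  # section
    print(tf_idf_acc_score(tf=3, df=10, n_docs=1000, acc=0.4))  # nested paragraph
    ```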

    GamifIR 2016: SIGIR 2016 Workshop on Gamification for Information Retrieval

    The third workshop on Gamification for Information Retrieval (GamifIR) took place on the 21st of July 2016 in conjunction with SIGIR 2016 in Pisa, Italy. It was the first GamifIR held in conjunction with SIGIR; the first and second workshops were both co-located with ECIR. The workshop program included one invited keynote presentation, seven paper presentations, and a discussion session. The keynote argued for the necessity of proper theory in gamification design and the opportunities such theory opens up. The paper presentations covered studies on diverse areas and approaches for the application of gamification.

    A Personalised Reader for Crowd Curated Content

    Personalised news recommender systems traditionally rely on content ingested from a select set of publishers and ask users to indicate their interests from a predefined list of topics. They then provide users with a feed of news items for each of their topics. In this demo, we present a mobile app that automatically learns users’ interests from their browsing or Twitter history and provides them with a personalised feed of diverse, crowd-curated content. The app also continuously learns from the users’ interactions as they swipe to like or skip recommended items. In addition, users can discover trending stories and content liked by other users they follow. The crowd is thus formed of the users, who as a whole act as the curators of the content to be recommended.
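
    The abstract does not detail the learning model; below is a minimal sketch of one plausible mechanism, in which a user's interest profile is a topic-weight vector nudged toward liked items and away from skipped ones. All names and the update rule are hypothetical, not the app's actual method.

    ```python
    import numpy as np

    class InterestProfile:
        """Toy interest model: a topic-weight vector updated from swipe feedback."""

        def __init__(self, n_topics, lr=0.1):
            self.weights = np.zeros(n_topics)
            self.lr = lr  # how strongly a single swipe shifts the profile

        def update(self, item_topics, liked):
            # Move toward liked items' topic mix, away from skipped ones.
            direction = 1.0 if liked else -1.0
            self.weights += direction * self.lr * item_topics

        def score(self, item_topics):
            # Rank candidate items by affinity with the current profile.
            return float(self.weights @ item_topics)

    profile = InterestProfile(n_topics=4)
    profile.update(np.array([0.0, 0.8, 0.2, 0.0]), liked=True)   # swiped to like
    profile.update(np.array([0.7, 0.0, 0.0, 0.3]), liked=False)  # swiped to skip
    print(profile.score(np.array([0.0, 1.0, 0.0, 0.0])))  # highest for topic 1
    ```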

    On the Social and Technical Challenges of Web Search Autosuggestion Moderation

    Past research shows that users benefit from systems that support them in their writing and exploration tasks. The autosuggestion feature of Web search engines is an example of such a system: it helps users formulate their queries by offering a list of suggestions as they type. Autosuggestions are typically generated by machine learning (ML) systems trained on a corpus of search logs and document representations. Such automated methods can become prone to issues that result in problematic suggestions that are biased, racist, sexist, or in other ways inappropriate. While current search engines have become increasingly proficient at suppressing such problematic suggestions, persistent issues remain. In this paper, we reflect on past efforts and on why certain issues still linger by covering explored solutions along a prototypical pipeline for identifying, detecting, and addressing problematic autosuggestions. To showcase their complexity, we discuss several dimensions of problematic suggestions, difficult issues along the pipeline, and why our discussion applies to the increasing number of applications beyond web search that implement similar textual suggestion features. By outlining persistent social and technical challenges in moderating web search suggestions, we provide a renewed call for action.
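
    The paper surveys a prototypical identify/detect/address pipeline rather than prescribing code; the sketch below only illustrates the general shape of such a pipeline, passing candidate suggestions through a blocklist stage and a stubbed classifier stage before display. The stages, names, and threshold are assumptions for illustration.

    ```python
    BLOCKLIST = {"offensive phrase"}  # hypothetical curated list

    def classifier_risk(suggestion):
        """Stub for an ML model scoring how problematic a suggestion is.
        A real system would call a trained text classifier here."""
        return 0.9 if "rumor" in suggestion else 0.1

    def moderate(candidates, risk_threshold=0.5):
        """Prototypical two-stage filter: exact blocklist match, then a
        classifier score, mirroring an identify/detect/address pipeline."""
        shown = []
        for s in candidates:
            if s.lower() in BLOCKLIST:
                continue  # address: suppress known-bad suggestions
            if classifier_risk(s) >= risk_threshold:
                continue  # detect: suppress suggestions the model flags
            shown.append(s)
        return shown

    print(moderate(["weather today", "celebrity rumor", "offensive phrase"]))
    ```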

    Rethinking Semi-supervised Learning with Language Models

    Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks. Currently, there are two popular approaches to make use of unlabelled data: Self-training (ST) and Task-adaptive pre-training (TAPT). ST uses a teacher model to assign pseudo-labels to the unlabelled data, while TAPT continues pre-training on the unlabelled data before fine-tuning. To the best of our knowledge, the effectiveness of TAPT in SSL tasks has not been systematically studied, and no previous work has directly compared TAPT and ST in terms of their ability to utilize the pool of unlabelled data. In this paper, we provide an extensive empirical study comparing five state-of-the-art ST approaches and TAPT across various NLP tasks and data sizes, including in- and out-of-domain settings. Surprisingly, we find that TAPT is a strong and more robust SSL learner, even when using just a few hundred unlabelled samples or in the presence of domain shifts, compared to more sophisticated ST approaches, and tends to bring greater improvements in SSL than in fully-supervised settings. Our further analysis demonstrates the risks of using ST approaches when the size of labelled or unlabelled data is small or when domain shifts exist. We offer a fresh perspective for future SSL research, suggesting the use of unsupervised pre-training objectives over dependence on pseudo-labels.
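
    As a minimal illustration of the self-training (ST) loop the paper compares against TAPT, the sketch below pseudo-labels unlabelled examples with a teacher model and retrains a student on the union. The scikit-learn-style model interface and the confidence threshold are placeholders, not the paper's setup.

    ```python
    def self_training(train, unlabelled, make_model, confidence=0.9):
        """One round of classic self-training (ST):
        1. fit a teacher on the labelled data,
        2. pseudo-label unlabelled examples the teacher is confident about,
        3. fit a student on labelled + pseudo-labelled data.
        TAPT would instead continue pre-training on `unlabelled` first."""
        X_l, y_l = train
        teacher = make_model()
        teacher.fit(X_l, y_l)

        pseudo_X, pseudo_y = [], []
        for x in unlabelled:
            probs = teacher.predict_proba([x])[0]
            if probs.max() >= confidence:  # keep only confident pseudo-labels
                pseudo_X.append(x)
                pseudo_y.append(probs.argmax())

        student = make_model()
        student.fit(list(X_l) + pseudo_X, list(y_l) + pseudo_y)
        return student

    # Hypothetical usage with a scikit-learn classifier:
    # from sklearn.linear_model import LogisticRegression
    # student = self_training(([[0.0], [1.0]], [0, 1]), [[0.2], [0.9]],
    #                         make_model=LogisticRegression)
    ```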

    INEX 2006 Evaluation Measures

    This paper describes the official measures of retrieval effectiveness employed at the ad hoc track of INEX 2006.

    On Aggregating Labels from Multiple Crowd Workers to Infer Relevance of Documents

    We consider the problem of acquiring relevance judgements for information retrieval (IR) test collections through crowdsourcing when no true relevance labels are available. We collect multiple, possibly noisy relevance labels per document from workers of unknown labelling accuracy. We use these labels to infer the document relevance based on two methods. The first method is the commonly used majority voting (MV), which determines the document relevance based on the label that received the most votes, treating all the workers equally. The second is a probabilistic model that concurrently estimates the document relevance and the workers’ accuracy using expectation maximization (EM). We run simulations and conduct experiments with crowdsourced relevance labels from the INEX 2010 Book Search track to investigate the accuracy and robustness of the relevance assessments to the noisy labels. We observe the effect of the derived relevance judgments on the ranking of the search systems. Our experimental results show that the EM method outperforms the MV method in the accuracy of relevance assessments and IR systems ranking. The performance improvements are especially noticeable when the number of labels per document is small and the labels are of varied quality.
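
    Majority voting and the EM-based alternative can be sketched compactly; below is a toy binary-relevance version in which one EM iteration re-weights workers by their estimated accuracy (in the spirit of Dawid-Skene style models; the paper's exact probabilistic model may differ).

    ```python
    from collections import defaultdict

    def majority_vote(labels):
        """labels: list of (worker_id, doc_id, 0/1 label).
        Returns doc_id -> label with the most votes, all workers equal."""
        votes = defaultdict(lambda: [0, 0])
        for _, doc, lab in labels:
            votes[doc][lab] += 1
        return {doc: int(v[1] >= v[0]) for doc, v in votes.items()}

    def em_step(labels, relevance):
        """One EM iteration (simplified: binary, symmetric worker accuracy).
        M-step: worker accuracy = agreement with current relevance estimates.
        E-step: re-estimate relevance as an accuracy-weighted vote."""
        hits, total = defaultdict(int), defaultdict(int)
        for w, doc, lab in labels:
            hits[w] += int(lab == relevance[doc])
            total[w] += 1
        acc = {w: hits[w] / total[w] for w in total}

        score = defaultdict(float)
        for w, doc, lab in labels:
            weight = acc[w] - 0.5  # zero weight for random-guessing workers
            score[doc] += weight if lab == 1 else -weight
        return {doc: int(s >= 0) for doc, s in score.items()}, acc

    labels = [("w1", "d1", 1), ("w2", "d1", 1), ("w3", "d1", 0),
              ("w1", "d2", 0), ("w2", "d2", 1), ("w3", "d2", 0)]
    rel = majority_vote(labels)             # initialise EM with majority vote
    rel, worker_acc = em_step(labels, rel)  # one EM refinement
    print(rel, worker_acc)
    ```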